Data Science by ODS.ai 🦜

⚙️ SWE-rebench: the Nebius AI R&D team presents a new dataset of software engineering (SWE) tasks.

The researchers built an automated system that collects and validates thousands of real-world tasks from GitHub for training and evaluating LLMs on software engineering.

Main features of the system:
1️⃣ Automatic data collection: Continuously extracts issue-PR pairs from Python repositories.
2️⃣ LLM-based environment setup: An LLM analyzes each repository, writes install instructions, and revises them when errors occur.
3️⃣ Execution-based validation: Each task is checked by automated setup and a test run, and its dependencies are frozen to make it reproducible.
4️⃣ LLM quality annotation: Tasks are labeled for clarity, difficulty, and test correctness to support filtering.
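
Steps 2 and 3 can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the team's code: `propose` stands in for the LLM call, `run_setup` for a sandboxed install, and the validity rule (at least one test goes from failing to passing once the PR's fix is applied, and nothing that passed before breaks) is a common SWE-bench-style criterion that the post does not spell out.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class SetupResult:
    ok: bool
    log: str  # captured output, fed back to the LLM when setup fails


def build_environment(
    propose: Callable[[Optional[str], Optional[str]], str],  # hypothetical LLM call
    run_setup: Callable[[str], SetupResult],                 # hypothetical sandbox runner
    max_attempts: int = 3,
) -> Optional[str]:
    """Request install instructions and retry with the error log until setup succeeds."""
    instructions: Optional[str] = None
    error_log: Optional[str] = None
    for _ in range(max_attempts):
        instructions = propose(instructions, error_log)
        result = run_setup(instructions)
        if result.ok:
            return instructions
        error_log = result.log
    return None  # give up; the task would be discarded


def is_valid_task(before: Dict[str, bool], after: Dict[str, bool]) -> bool:
    """before/after map test name -> passed, without and with the PR's fix applied."""
    fail_to_pass = [t for t, ok in after.items() if ok and not before.get(t, False)]
    regressions = [t for t, ok in before.items() if ok and not after.get(t, False)]
    return bool(fail_to_pass) and not regressions
```

Once a task passes such a check, freezing the installed package versions (for example with `pip freeze`) is one way to keep the environment reproducible.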

Result:
SWE-rebench dataset: 21,000+ ready-to-use interactive tasks.
Continuous updates: Fresh data is added regularly.
Transparent evaluation: The tasks are used for the public SWE-rebench leaderboard.
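
The quality labels are meant to support filtering, so a subset of tasks could be selected roughly as below. This is only a sketch: the field names, the integer scales, and the thresholds are hypothetical and do not reflect the dataset's real schema.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class TaskLabels:
    task_id: str
    clarity: int         # hypothetical scale: 0 = clear issue text, 3 = unclear
    difficulty: int      # hypothetical scale: 0 = easy, 3 = hard
    tests_correct: bool  # hypothetical flag: tests judged to match the issue


def select_tasks(
    tasks: Iterable[TaskLabels],
    max_unclarity: int = 1,
    max_difficulty: int = 2,
) -> List[str]:
    """Keep IDs of tasks that are clear enough, not too hard, and have sound tests."""
    return [
        t.task_id
        for t in tasks
        if t.tests_correct
        and t.clarity <= max_unclarity
        and t.difficulty <= max_difficulty
    ]
```

Tightening or loosening the thresholds trades dataset size against label quality.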

🚀 SWE-rebench gives researchers and developers real, validated tasks for working with LLMs in the SWE field.

Technical report: arXiv
Dataset: SWE-rebench


2.0K views, May 29 at 15:03

tg-me.com/opendatascience/2330

Created: 2025-05-29
Last updated: 2025-06-01 05:27:27


